Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ntuple] add different modes of descriptor loading to RNTupleSerializer #17541

Merged
merged 2 commits into from
Jan 29, 2025

Conversation

silverweed
Copy link
Contributor

The default mode - the same we did so far - is "for reading": this means that we not only load the on-disk information into the in-memory descriptor, but we also fixup the suppressed column ranges and the clusters with the info coming from the header extension (deferred columns); this is a required step for reading the rntuple correctly.
We then add two additional modes: "for writing" and "raw". "for writing" is the same as "for reading" except it doesn't do the header extension fixup - this mode will be used by the RNTupleMerger to properly write out an incrementally-merged RNTuple. "raw" simply deserializes the on-disk information without doing any fixup.

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

@silverweed silverweed requested review from hahnjo and enirolf January 27, 2025 15:13
@silverweed silverweed self-assigned this Jan 27, 2025
@silverweed silverweed requested a review from jblomer as a code owner January 27, 2025 15:13
Copy link

github-actions bot commented Jan 28, 2025

Test Results

    18 files      18 suites   5d 4h 24m 45s ⏱️
 2 683 tests  2 681 ✅ 0 💤 2 ❌
46 598 runs  46 595 ✅ 0 💤 3 ❌

For more details on these failures, see this check.

Results for commit 07f170b.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@jblomer jblomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle looks good to me! Except minor comments, I think we should add a unit test that constructs a descriptor with one suppressed column and one deferred column and exercises the three reconstruction modes.

tree/ntuple/v7/inc/ROOT/RNTupleSerialize.hxx Outdated Show resolved Hide resolved
tree/ntuple/v7/src/RNTupleSerialize.cxx Outdated Show resolved Hide resolved
tree/ntuple/v7/src/RNTupleSerialize.cxx Outdated Show resolved Hide resolved
The default mode - the same we did so far - is "for reading": this means
that we not only load the on-disk information into the in-memory
descriptor, but we also fixup the suppressed column ranges and the
clusters with the info coming from the header extension (deferred
columns); this is a required step for reading the rntuple
correctly.
We then add two additional modes: "for writing" and "raw".
"for writing" is the same as "for reading" except it doesn't do the
header extension fixup - this mode will be used by the RNTupleMerger
to properly write out an incrementally-merged RNTuple.
"raw" simply deserializes the on-disk information without doing any
fixup.
Copy link
Contributor

@jblomer jblomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM.

tree/ntuple/v7/test/ntuple_serialize.cxx Show resolved Hide resolved
@silverweed silverweed merged commit 3602362 into root-project:master Jan 29, 2025
17 of 21 checks passed
@silverweed silverweed deleted the ntuple_attach branch January 29, 2025 07:56
hahnjo added a commit to hahnjo/rntuple-apps that referenced this pull request Feb 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants